Abstract: A neural network based on Wilson-Cowan oscillators is used to perform object recognition in a two-dimensional visual scene. The temporal correlation among groups of oscillating neurons is used as the main criterion to solve the classic binding and segmentation problem. The network uses an original pattern of short-range lateral excitations among adjacent neurons to solve the binding problem, and an external inhibitory global neuron to provide segmentation of multiple objects in the same visual scene. The latter may represent an "attention mechanism" from neurons at a higher hierarchical level. Simulations performed by using multiple idealized figures (up to 4-5) in the presence of noise suggest that the network can satisfactorily recognize objects in most cases. However, the threshold and time constant of the attention mechanism depend on the complexity (number of objects and level of noise) of the scene under examination. The present results may be useful to improve our understanding of how distributed activities are integrated in the neural system to form single object perceptions. In perspective, the proposed model may find applications in practical algorithms for object recognition.
Keywords: Image segmentation, oscillatory neurons, object recognition.

I.  Introduction 

A fundamental task that the brain ordinarily solves in daily life is the segmentation of the visual scene into a set of distinct objects. This task requires the simultaneous solution of two complementary problems: a "binding" problem, which consists in assembling the common attributes of a single object into a unique figure, and the "segmentation" problem, which requires the visual scene to be decomposed into distinct figures, so that attributes of different objects are not grouped together. It is generally assumed that binding and segmentation of the visual scene are based on the principles of Gestalt psychology, such as proximity, similarity, common fate, connectedness, good continuation, etc. [1], [2], [3], [4].

Although the brain easily solves the binding and segmentation problem, solving it with artificial neural networks remains difficult. A traditional hypothesis is that information carrying different attributes of the same object converges to neurons at a higher hierarchical level. These neurons, in turn, respond selectively only to those groups of features which characterize a single object (grandmother cell representation). This assumption, however, involves many theoretical and neurophysiological problems, and it is usually rejected today.

A second, more recent theory assumes that binding and segmentation are accomplished by the brain on the basis of temporal correlation between neural activities. Accordingly, neurons that fire in phase would signal common attributes of the same object. This hypothesis is supported by experiments in anesthetized cats and monkeys [1], [2]. In these studies, stimuli which, according to the Gestalt theory, would belong to the same figure induce synchronized activity in groups of neurons. Conversely, stimuli that belong to different figures fail to induce synchronized responses.

In order to explore the previous aspects theoretically, a few models of oscillating neurons have been used in recent years [3], [4], [5], [6], [7]. These models suggest that binding and segmentation can be achieved using lateral connections between groups of oscillating neurons. However, many problems remain open, especially concerning the segmentation of multiple objects in the same scene.

The aim of this work is to use an original neural network, based on Wilson-Cowan oscillators, to analyze the binding and segmentation problem in a two-dimensional visual scene in the presence of noise. Original lateral connections are used to impose synchronism between neurons, while an attention mechanism is proposed to achieve segmentation. A few examples are presented and discussed.

II.  System  description 

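The model of a single oscillator consists of a feedback loop between an excitatory unit $x_{ij}$ and an inhibitory unit $y_{ij}$. The time derivatives are defined as:

$$
\begin{cases}
\dfrac{d}{dt} x_{ij}(t) = -x_{ij}(t) + H\bigl(x_{ij}(t) - \beta \cdot y_{ij}(t) + S_{ij} + I_{ij} + \rho - \varphi_x - z(t)\bigr) \\[6pt]
\dfrac{d}{dt} y_{ij}(t) = -\gamma \cdot y_{ij}(t) + H\bigl(\alpha \cdot x_{ij}(t) - \varphi_y\bigr) + S_{ij}
\end{cases}
\qquad (1)
$$

where $i$ and $j$ represent the position of the oscillator within a two-dimensional network ($1 \le i \le N$; $1 \le j \le M$), and:

$$
H(v) = \frac{1}{1 + e^{-v/T}} . \qquad (2)
$$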
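As a minimal sketch of Eqs. (1) and (2) for a single oscillator, the following Python fragment evaluates the sigmoid and the two time derivatives. It assumes the reading of (1) in which the inhibitory unit enters the excitatory equation as $-\beta y$ and the coupling term is added outside the sigmoid in the inhibitory equation. The function and argument names (`sigmoid`, `oscillator_rhs`, `noise`) are illustrative, and the default values are those listed in the caption of fig. 3.

```python
import math


def sigmoid(v, T=0.025):
    """H(v) = 1 / (1 + exp(-v / T)), evaluated without overflow for large |v|."""
    u = v / T
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def oscillator_rhs(x, y, S=0.0, I=0.8, noise=0.0, z=0.0,
                   alpha=0.25, beta=2.5, phi_x=0.7, phi_y=0.15,
                   gamma=1.0, T=0.025):
    """Return (dx/dt, dy/dt) for one oscillator, following the assumed form of Eq. (1).

    S is the lateral coupling term, I the external stimulus, noise the noise
    term and z the Global Separator activity (all supplied by the caller).
    """
    dx = -x + sigmoid(x - beta * y + S + I + noise - phi_x - z, T)
    dy = -gamma * y + sigmoid(alpha * x - phi_y, T) + S
    return dx, dy
```

With these defaults, an isolated oscillator at rest (`x = y = 0`) has a large positive `dx` when stimulated with `I = 0.8` and a nearly zero `dx` when `I = 0`, because the sigmoid is very steep for `T = 0.025`.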

Fig. 1. (a) A diagram showing the interactions between the excitatory and the inhibitory units in a chain of oscillators. (A) represents excitatory connections; (o) represents inhibitory connections. Each excitatory unit is connected to the excitatory unit (solid line) and to the inhibitory unit (dashed line) of each nearest-neighbor oscillator. (b) Schematic representation: the neural oscillators in black circles are connected only to neural oscillators in grey circles. GS represents the Global Separator.

Equations (1), (2) essentially describe a simplified Wilson-Cowan oscillator. This oscillator model can be biologically interpreted as a mean field approximation of a network of excitatory and inhibitory neurons. The parameters have the following meaning: $\alpha$ and $\beta$ are positive parameters describing the coupling between the two units; in particular, $\alpha$ influences the amplitude of oscillations; $I$ represents external stimulation; $\rho$ denotes a noise term. $H(v)$ is a sigmoid activation function with thresholds $\varphi_x$ and $\varphi_y$ for the excitatory and inhibitory units, respectively. $T$ affects the central slope of the sigmoidal relationship, and $\gamma$ is inversely proportional to the time constant of the inhibitory units; hence it controls the frequency of oscillations. $z(t)$ represents the activity of a Global Separator GS, which is specified later on. $S_{ij}$ represents a coupling term that we define as:


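$$
S_{ij} = 8 \cdot W \cdot \frac{\sum_{h=-1}^{1} \sum_{k=-1}^{1} F_{ijhk}\, x_{(i+h)(j+k)}}{\sum_{h=-1}^{1} \sum_{k=-1}^{1} F_{ijhk}} \qquad (3)
$$

where:

$$
F_{ijhk} =
\begin{cases}
0 & \text{if } h = k = 0 \ \text{or } i + h > N \ \text{or } i + h < 1 \ \text{or } j + k > M \ \text{or } j + k < 1 \\
1 & \text{otherwise}
\end{cases}
\qquad (4)
$$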


$F_{ijhk}$ is a normalization factor for the connection weight $W$: in (3) the connection weights are suitably normalized in relation to the oscillator position, which determines the number of neurons involved in coupling. In this way, the model respects an isotropy property (fig. 1, fig. 2). The term $S_{ij}$ inside $H(v)$ implements the coupling between the excitatory unit of an oscillator and the excitatory units of its nearest neighbors. The same term $S_{ij}$ in the equation of the inhibitory unit implements the coupling between the excitatory unit and the inhibitory units of its nearest neighbors: both couplings are excitatory. The second type of coupling, which was not used in previous models, is essential to improve synchronization of neurons within the same object.
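The position-dependent normalization can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name `coupling`, the 0-based indexing, and the argument `n_full` (the leading factor in (3), taken here as the eight neighbors of an interior oscillator) are choices made for the example. $F_{ijhk}$ is implemented implicitly by skipping the center and any out-of-grid neighbor.

```python
def coupling(x, i, j, W=1.0, n_full=8):
    """Normalized nearest-neighbor coupling S_ij on a grid x (list of rows).

    Indices are 0-based here. The sum runs over the up-to-eight neighbors
    that exist inside the grid; dividing by their count plays the role of
    the sum of F_ijhk in the denominator of Eq. (3).
    """
    n_rows, n_cols = len(x), len(x[0])
    total = 0.0
    count = 0
    for h in (-1, 0, 1):
        for k in (-1, 0, 1):
            if h == 0 and k == 0:
                continue  # F = 0 for the oscillator itself
            r, c = i + h, j + k
            if 0 <= r < n_rows and 0 <= c < n_cols:
                total += x[r][c]  # F = 1 for neighbors inside the grid
                count += 1
    if count == 0:
        return 0.0  # degenerate 1x1 grid: no neighbors
    return n_full * W * total / count
```

On a uniformly active grid, a corner oscillator (three neighbors) and an interior oscillator (eight neighbors) receive the same coupling, which is the isotropy property described above.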

Fig. 2. Architecture of a two-dimensional network with eight nearest-neighbor excitatory coupling. An oscillator is indicated by a circle. The circle labeled GS represents the Global Separator. Three connection situations are shown: each black circle is connected only to the adjacent grey circles.

The local excitatory connections assumed in our model are consistent with various lateral connections in the brain; in particular, they could be interpreted as the horizontal connections in the visual cortex. Our simulations have revealed that, although the lateral excitatory couplings allow fast synchronization of all oscillators excited by the same stimulus, they do not permit satisfactory desynchronization of oscillators excited by different objects. For this reason we introduced a Global Separator GS, through which we want to simulate a mechanism of attention. GS is an inhibitory interneuron defined as:


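$$
\frac{d}{dt} z(t) = \ldots\,\bigl(\ldots\, z(t)\bigr) - \delta \cdot z(t) \qquad (5)
$$

where:

$$
\sigma =
\begin{cases}
1 & \text{if } \sum_{i=1}^{N} \sum_{j=1}^{M} x_{ij}(t) > \vartheta \\
0 & \text{otherwise}
\end{cases}
\qquad (6)
$$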


$\sigma$ is a binary value and $\vartheta$ controls the level of activity of the entire network. The positive parameters $\delta$ and $\varphi$ control the rate of growth and decay of $z(t)$, and therefore the segmentation capacity of GS. We can speculate that $\sigma$, $\delta$ and $\varphi$ fix the degree of attention of GS. GS receives excitatory input from the entire network and sends inhibition to all oscillators: these long-range connections give rise to desynchronization (fig. 1, fig. 2). The inhibitory input is generated whenever there are some fairly active regions in the neural grid: only the neurons with enough activity survive the inhibition and continue to oscillate.
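The behavior described above (a binary trigger on global activity, followed by growth and decay of $z$) can be sketched as below. The right-hand side used in `gs_rhs` is an assumption made only for illustration: it takes $z$ to grow toward 1 at rate $\varphi$ while $\sigma = 1$ and to decay at rate $\delta$, and it should not be read as the exact form of Eq. (5). The trigger in `gs_sigma` likewise assumes that the summed excitatory activity is compared with the threshold. All function names are hypothetical; the default values are the GS parameters from the caption of fig. 3.

```python
def gs_sigma(x, theta=1.8):
    """Binary trigger: 1 if the summed excitatory activity exceeds theta, else 0.

    x is the grid of excitatory activities (list of rows). Summing x_ij is an
    assumption about the quantity that is compared with the threshold.
    """
    total = sum(sum(row) for row in x)
    return 1 if total > theta else 0


def gs_rhs(z, sigma, phi=2.0, delta=4.5):
    """ASSUMED first-order dynamics for the Global Separator activity z.

    Growth at rate phi while sigma == 1, decay at rate delta; this specific
    expression is illustrative and not taken from Eq. (5).
    """
    return phi * sigma * (1.0 - z) - delta * z
```

Under this assumed form, $z$ settles at $\varphi / (\varphi + \delta)$ while the trigger stays on and relaxes to zero once the network activity falls below the threshold.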

Physiologically, we know that the thalamic reticular complex may be involved in the global control of selective attention: it receives input from and sends projections to almost the entire cortex. The activity of GS should be interpreted as the collective behavior of the neural group in the thalamic reticular complex.


III.  Simulation  results 

To illustrate how our network is used for image segmentation, we have simulated a 15x15 and a 15x20 grid of neural oscillators with a Global Separator. In the first simulation we map two objects (designed as the sun and a car): for all stimulated oscillators $I = 0.8$, for the others $I = 0$. The image has been corrupted by means of a uniformly distributed random noise. For the background, the uniform distribution has mean equal to 0.1, while for the stimulated oscillators it has mean equal to 0. The variance is equal to 0.003. The set of ordinary differential equations has been numerically solved on Pentium-based personal computers, using the fourth-order Runge-Kutta integration method with random initial conditions.
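The two numerical ingredients named above can be sketched in Python. A uniform distribution on $[m - a, m + a]$ has variance $a^2/3$, so a variance of 0.003 corresponds to a half-width $a = \sqrt{3 \cdot 0.003} \approx 0.095$. The classical fourth-order Runge-Kutta step is standard; the function names, the list-based state, and the step size used in the usage note are assumptions of this sketch, since the text does not report them.

```python
import math
import random


def uniform_noise(mean, variance, rng=random):
    """Draw one sample from a uniform distribution with the given mean and variance."""
    half_width = math.sqrt(3.0 * variance)  # Var of U(m - a, m + a) is a^2 / 3
    return rng.uniform(mean - half_width, mean + half_width)


def rk4_step(f, t, state, dt):
    """One classical fourth-order Runge-Kutta step for d(state)/dt = f(t, state).

    state is a list of floats and f returns a list of the same length.
    """
    def shifted(k, scale):
        return [s + scale * ki for s, ki in zip(state, k)]

    k1 = f(t, state)
    k2 = f(t + dt / 2, shifted(k1, dt / 2))
    k3 = f(t + dt / 2, shifted(k2, dt / 2))
    k4 = f(t + dt, shifted(k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
```

For example, `uniform_noise(0.1, 0.003)` draws a background noise value and `uniform_noise(0.0, 0.003)` a value for a stimulated oscillator, while repeated calls such as `state = rk4_step(f, t, state, 0.1)` advance the full set of oscillator and GS variables in time.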
Fig. 3. The first picture represents the noisy input, on a noisy background, used for the network. The image is mapped to a 15x15 network and each red square denotes a single oscillator that receives input. Each following picture represents network activity at a time step in the numerical simulation. We name the objects as follows: a small car, a sun. The parameters are: $\alpha = 0.25$, $\beta = 2.5$, $\varphi_x = 0.7$, $\varphi_y = 0.15$, $\gamma = 1$, $W = 1$, $I_{ij} = 0.8$, $T = 0.025$ and for GS: $\vartheta = 1.8$, $\varphi = 2$, $\delta = 4.5$. The simulation took 2000 integration steps.

Fig. 4. A noisy image, on a noisy background, composed of four patterns mapped to a 15x20 network. The objects composing the image represent a small car, a garage, a sun and a cloud. The first picture represents the noisy input and each following picture represents network activity at a time step in the numerical simulation. The parameters for GS are: $\vartheta = 2$, $\varphi = 2$, $\delta = 8$. The other parameters are as specified in the caption of fig. 3 except $\gamma = 0.5$. The simulation took 2000 integration steps.


Fig. 5. Temporal evolution of every stimulated oscillator in the simulation of fig. 4. The upper four traces show the combined x activities of the four oscillator blocks representing the four corresponding objects, indicated by their respective labels. The fifth trace shows the activity of the Global Separator.

Fig. 3 shows the instantaneous activity (snapshot) of each neuron of the network at various stages of dynamic evolution. A short time after the input is applied, we can observe the clear effect of synchronization and desynchronization: the noisy sun is segmented from the noisy car and from the noisy background; then each object "fires" periodically, separately from the others. Most parameters in (1), (2), (5), (6) are intrinsic to the neural network and need not be changed after they have been appropriately chosen. Only the parameters concerning GS and the oscillation frequency ($\gamma$) need to be tuned for applications. In fact, with a fixed set of parameters, the dynamical system can segment only a limited number of patterns. This number depends on the ratio between the time that a single oscillator spends in the silent phase and the time that it spends in the active phase (segmentation capacity). Furthermore, GS requires different degrees of attention. More specifically, in order to separate the entire image correctly, during each oscillation period of the network GS must be able to generate as many inhibitory impulses as the number of objects to be separated.

To illustrate this point, we stimulated a 15x20 network with an arbitrary image containing four objects designed as a car, a garage, a cloud and the sun (fig. 4). In this case, correct segmentation required a greater degree of attention for GS because the image is more complex: the parameters in the simulations of fig. 3 and fig. 4 are the same except for $\gamma$ and GS¹. The need to modify the parameters of the GS was also observed by Wang and Terman [3], although these authors used a different mechanism for global inhibition. Fig. 5 shows the temporal evolution of the oscillators stimulated by each noisy object. The four upper traces represent the activities of the four oscillator blocks and the bottom trace represents the activity of GS. Synchrony within each block and desynchrony between different blocks are achieved after a few cycles.
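One simple way to put a number on "synchrony within a block, desynchrony between blocks" is the zero-lag Pearson correlation between two activity traces. This measure is not specified in the text; it is offered here only as a hedged illustration of the temporal-correlation criterion, and the function name `correlation` is hypothetical.

```python
import math


def correlation(a, b):
    """Zero-lag Pearson correlation between two equally long activity traces.

    Returns 0.0 when either trace is constant (correlation undefined).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

Two oscillators that fire together give a value close to 1, whereas oscillators belonging to objects that fire in alternation give a low or negative value.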

IV.  Discussion 

The present study introduces some new aspects compared with previous models:

i) We used an original pattern of lateral connections among groups of oscillating neurons. In particular, the presence of a short-range excitatory connection between the excitatory neurons, $x_{ij}$, and the adjacent inhibitory neurons allowed robust synchronization to be attained in a large variety of visual scenes, even in the presence of noise. Without this connection, synchronization can be achieved only with difficulty in many cases.

ii) Segmentation, in the presence of multiple objects, cannot be achieved with short-range lateral connections only; it requires the presence of an external inhibitory neuron. The same conclusion was reached by Wang et al. [3] using a different mechanism of global inhibition. The external inhibitory neuron sends an inhibitory signal to all neurons in the network as soon as their global activity exceeds a given threshold. This global inhibition, in turn, allows only neurons belonging to a single object (i.e., the object with the greatest excitation at that moment) to fire, thus realizing segmentation in the same scene. It is interesting to observe that the global inhibition proposed in this work can simulate an "attention mechanism", originating from neurons in the cerebral cortex at a higher hierarchical level. Accordingly, we observed that segmentation of a different number of objects, and/or the use of a different noise level, requires a modification of the threshold and time constant of the attention mechanism. This result agrees with the idea that a complex, noisy visual scene demands more attention to be correctly perceived, while less attention is required for the perception of a few objects in a noise-free environment. At any given level of attention, only a limited number of distinct objects can be separately identified. This important property of the system agrees with the well-known psychological principle that there are fundamental limits on the number of simultaneously perceived objects.

iii) An important feature of the present system is the capacity to recognize objects even in the presence of strong noise superimposed on the visual scene. In particular, we observed that noise superimposed on the background causes only a mild deterioration in the system performance. When noise is superimposed directly on the object, the system can still solve the binding and segmentation problem in most cases, even when 4 or 5 objects are simultaneously present.

¹ For the parameter values see the captions of fig. 3 and fig. 4.

V.  Conclusions 

In conclusion, the present work suggests that the binding and segmentation problem can be solved by using temporal correlation among neuron activities. The solution of the problem benefits from short-range lateral connections among groups of oscillating neurons, and from the presence of an external attention mechanism. The mathematical form of both mechanisms is original compared with previous studies. The present results may be useful to improve our understanding of how distributed activities are integrated in the neural system to form single object perceptions. Moreover, the proposed model may find applications in practical algorithms for object recognition.